Python fundamentals

A quick introduction to the Python programming language and Jupyter notebooks. (We're using Python 3, not Python 2.)

Basic data types and the print() function


In [ ]:
# variable assignment
# https://www.digitalocean.com/community/tutorials/how-to-use-variables-in-python-3

# strings -- enclose in single or double quotes, just make sure they match
my_name = 'Cody'

# numbers
int_num = 6
float_num = 6.4

# the print function
print(8)
print('Hello!')
print(my_name)
print(int_num)
print(float_num)

# booleans
print(True)
print(False)
print(4 > 6)
print(6 == 6)
print('ell' in 'Hello')

Basic math

You can do basic math with Python. (You can also do more advanced math.)


In [ ]:
# addition
add_eq = 4 + 2

# subtraction
sub_eq = 4 - 2

# multiplication
mult_eq = 4 * 2

# division
div_eq = 4 / 2

# etc.
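
As a rough sketch of what "etc." might cover, the cell below shows three more arithmetic operators: floor division, modulo and exponentiation. The variable names are only illustrative. Note that `/` always returns a float in Python 3, even when the numbers divide evenly.

```python
# true division always returns a float in Python 3
div_eq = 4 / 2
print(div_eq)      # 2.0

# floor division drops the remainder
floor_eq = 7 // 2
print(floor_eq)    # 3

# modulo returns the remainder
mod_eq = 7 % 2
print(mod_eq)      # 1

# exponentiation
pow_eq = 4 ** 2
print(pow_eq)      # 16
```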

Lists

A comma-separated collection of items between square brackets: []. Python keeps track of the order of things inside a list.


In [ ]:
# create a list: name, hometown, age
# an item's position in the list is the key thing
cody = ['Cody', 'Midvale, WY', 32]

# create another list of mixed data
my_list = [1, 2, 3, 'hello', True, ['a', 'b', 'c']]

# use len() to get the number of items in the list
my_list_count = len(my_list)

print('There are', my_list_count, 'items in my list.')

# use square brackets [] to access items in a list
# (counting starts at zero in Python)

# get the first item
first_item = my_list[0]
print(first_item)

# you can do negative indexing to get items from the end of your list

# get the last item
last_item = my_list[-1]
print(last_item)

# Use colons to get a range of items in a list

# get the first two items
# the last number in a list slice is the first list item that's ~not~ included in the result
my_range = my_list[0:2]
print(my_range)

# if you leave the last number off, it takes the item at the first number's index and everything afterward
# get everything from the third item onward
my_open_range = my_list[2:]
print(my_open_range)

# Use append() to add things to a list
my_list.append(5)
print(my_list)

# Use pop() to remove (and return) the last item in a list
my_list.pop()
print(my_list)

# use join() to join items from a list into a string with a delimiter of your choosing
letter_list = ['a', 'b', 'c']
joined_list = '-'.join(letter_list)
print(joined_list)
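
Two details that the list examples above hint at but don't show: pop() hands back the item it removes, and negative numbers work inside slices as well as single indexes. A minimal sketch, with illustrative variable names:

```python
my_list = [1, 2, 3, 'hello', True, ['a', 'b', 'c']]

# pop() returns the item it removes, so you can hang on to it
removed = my_list.pop()
print(removed)         # ['a', 'b', 'c']
print(len(my_list))    # 5

# negative numbers work in slices, too: get the last two items
last_two = my_list[-2:]
print(last_two)        # ['hello', True]
```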

Dictionaries

A data structure that maps keys to values inside curly brackets: {}. Items in the dictionary are separated by commas. As of Python 3.7, dictionaries keep track of the order in which items were inserted. In older versions they did not, and you needed an OrderedDict (from the collections module) to preserve insertion order.


In [ ]:
my_dict = {'name': 'Cody', 'title': 'Training director', 'organization': 'IRE'}

# Access items in a dictionary using square brackets and the key (typically a string)
my_name = my_dict['name']
print(my_name)

# You can also use the `get()` method to retrieve values
# you can optionally provide a second argument as the default value
# if the key doesn't exist (otherwise defaults to `None`)
my_name = my_dict.get('name', 'Jefferson Humperdink')
print(my_name)

# Use the .keys() method to get the keys of a dictionary
print(my_dict.keys())

# Use the .values() method to get the values
print(my_dict.values())

# add items to a dictionary using square brackets, the name of the key (typically a string)
# and set the value like you'd set a variable, with =
my_dict['my_age'] = 32
print(my_dict)

# delete an item from a dictionary with `del`
del my_dict['my_age']
print(my_dict)
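
A short sketch of what happens when a key is missing, and of the order in which keys come back. The 'phone' key and the 'unlisted' default are made up for illustration.

```python
my_dict = {'name': 'Cody', 'title': 'Training director', 'organization': 'IRE'}

# use `in` to check whether a key exists
has_name = 'name' in my_dict
has_phone = 'phone' in my_dict
print(has_name)     # True
print(has_phone)    # False

# get() returns None for a missing key instead of raising a KeyError
phone = my_dict.get('phone')
print(phone)        # None

# ... or the default you supply as the second argument
phone_or_default = my_dict.get('phone', 'unlisted')
print(phone_or_default)    # unlisted

# keys come back in insertion order (Python 3.7+)
key_list = list(my_dict.keys())
print(key_list)     # ['name', 'title', 'organization']
```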

Commenting your code

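Python skips everything on a line after a hash sign (#) -- these lines are used to write comments to help explain the code to others (and to your future self).

Python has no dedicated syntax for multi-line comments. A common workaround is a triple-quoted string (""" """) that isn't assigned to anything: Python evaluates the string and throws it away, so it behaves like a comment.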


In [ ]:
# this is a one-line comment

"""
This is a 
multi-line comment

~~~

"""

Comparison operators

When you want to compare values, you can use these symbols:

  • < means less than
  • > means greater than
  • == means equal to
  • >= means greater than or equal to
  • <= means less than or equal to
  • != means not equal to

In [ ]:
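# a notebook cell only displays the value of its last bare expression,
# so wrap each comparison in print() to see every result
print(4 > 6)

print('Hello!' == 'Hello!')

print((2 + 2) != (4 * 2))

print(100.2 >= 100)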

String functions

Python has a number of built-in methods to work with strings. They're useful if, say, you're using Python to clean data. Here are a few of them:

strip()

Call strip() on a string to remove whitespace from either side. It's like using the =TRIM() function in Excel.


In [ ]:
whitespace_str = '    hello!      '
print(whitespace_str.strip())
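
If you only want to trim one side, or want to trim something other than whitespace, a sketch along these lines may help. It uses the related string methods lstrip() and rstrip(), and passes a character to strip(). The sample values are made up.

```python
padded = '  hello!  '

# lstrip() trims the left side only, rstrip() the right side only
left_clean = padded.lstrip()
right_clean = padded.rstrip()
print(repr(left_clean))     # 'hello!  '
print(repr(right_clean))    # '  hello!'

# pass a string of characters to strip() to trim those instead of whitespace
amount = '$45.00'
amount_clean = amount.strip('$')
print(amount_clean)         # 45.00
```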

upper() and lower()

Call .upper() on a string to make the characters uppercase. Call .lower() on a string to make the characters lowercase. This can be useful when testing strings for equality.


In [ ]:
my_name = 'Cody'

my_name_upper = my_name.upper()
print(my_name_upper)

my_name_lower = my_name.lower()
print(my_name_lower)
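
To illustrate the point about testing for equality, here is a minimal sketch that compares two spellings of the same name. The variable names are illustrative.

```python
name_a = 'CODY'
name_b = 'cody'

# a direct comparison is case sensitive
exact_match = name_a == name_b
print(exact_match)       # False

# lowercase both sides first for a case-insensitive comparison
loose_match = name_a.lower() == name_b.lower()
print(loose_match)       # True
```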

replace()

Use .replace() to substitute bits of text.


In [ ]:
company = 'Bausch & Lomb'

company_no_ampersand = company.replace('&', 'and')

print(company_no_ampersand)

split()

Use .split() to split a string on some delimiter. If you don't specify a delimiter, it splits on any run of whitespace (spaces, tabs, newlines).


In [ ]:
date = '6/4/2011'

date_split = date.split('/')

print(date_split)
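
Calling .split() with no argument is not quite the same as calling .split(' '). A small sketch of the difference, using a made-up string with a double space, a tab and a newline in it:

```python
messy = 'one  two\tthree\nfour'

# no argument: split on any run of whitespace
default_split = messy.split()
print(default_split)    # ['one', 'two', 'three', 'four']

# explicit single space: split on each individual space character only
space_split = messy.split(' ')
print(space_split)      # ['one', '', 'two\tthree\nfour']
```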

zfill()

Among other things, you can use .zfill() to add zero padding -- for instance, if you're working with ZIP code data that was saved as a number somewhere and you've lost the leading zeroes for that handful of ZIP codes that begin with 0.

Note: .zfill() is a string method, so if you want to apply it to a number, you'll need to first coerce it to a string with str().


In [ ]:
mangled_zip = '2301'
fixed_zip = mangled_zip.zfill(5)
print(fixed_zip)

num_zip = 2301
fixed_num_zip = str(num_zip).zfill(5)
print(fixed_num_zip)

slicing

Like lists, strings are sequences, so you can use slicing to grab chunks.


In [ ]:
my_string = 'supercalifragilisticexpialidocious'

chunk = my_string[9:20]

print(chunk)
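
The other list-indexing tricks carry over to strings as well. A brief sketch, reusing the same string with illustrative variable names:

```python
my_string = 'supercalifragilisticexpialidocious'

# leave off the first number to start from the beginning
first_five = my_string[:5]
print(first_five)     # super

# negative numbers count from the end
last_seven = my_string[-7:]
print(last_seven)     # docious

# a single index returns a single character
first_char = my_string[0]
print(first_char)     # s
```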

startswith(), endswith() and in

If you need to test whether a string starts with a series of characters, use .startswith(). If you need to test whether a string ends with a series of characters, use .endswith(). If you need to test whether a string is part of another string -- or a list of strings -- use the in operator (it's a keyword, not a method, so there's no dot and no parentheses).

These are case sensitive, so you'd typically .upper() or .lower() the strings you're comparing to ensure an apples-to-apples comparison.


In [ ]:
str_to_test = 'hello'

print(str_to_test.startswith('hel'))
print(str_to_test.endswith('lo'))
print('el' in str_to_test)
print(str_to_test in ['hi', 'whatsup', 'salutations', 'hello'])
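
A minimal sketch of the case-sensitivity point, using a made-up city name: the same test fails on the raw string and passes once the string has been lowercased or uppercased.

```python
city = 'Saint Louis'

# case sensitive: 'saint' does not match 'Saint'
raw_test = city.startswith('saint')
print(raw_test)         # False

# lowercase the string first for an apples-to-apples comparison
lower_test = city.lower().startswith('saint')
print(lower_test)       # True

# the same idea works with `in`
upper_test = 'LOUIS' in city.upper()
print(upper_test)       # True
```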

String formatting

Using curly brackets with the various options available to the .format() method, you can create string templates for your data. Some examples:


In [ ]:
# date in m/d/yyyy format
in_date = '8/17/1982'

# split out individual pieces of the date
# using a shortcut method to assign variables to the resulting list
month, day, year = in_date.split('/')

# reshuffle as yyyy-mm-dd using .format()
# use a formatting option (:0>2) to left-pad month/day numbers with a zero
out_date = '{}-{:0>2}-{:0>2}'.format(year, month, day)

print(out_date)

In [ ]:
# construct a greeting template
greeting = 'Hello, {}! My name is {}.'
your_name = 'Pat'
my_name = 'Cody'

print(greeting.format(your_name, my_name))
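
As an alternative to .format(), Python 3.6 and later also support f-strings, which put the variable names directly inside the curly brackets. This is a sketch of the two examples above rewritten that way, not a replacement for them. The same formatting options (like :0>2) apply.

```python
your_name = 'Pat'
my_name = 'Cody'

# prefix the string with f and put variable names inside the curly brackets
greeting = f'Hello, {your_name}! My name is {my_name}.'
print(greeting)     # Hello, Pat! My name is Cody.

# the date example, with the same zero-padding option
month, day, year = '8/17/1982'.split('/')
out_date = f'{year}-{month:0>2}-{day:0>2}'
print(out_date)     # 1982-08-17
```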

Type coercion

Consider:

# this is a number, can't do string-y things to it
age = 32

# this is a string, can't do number-y things to it
age = '32'

There are several functions you can use to coerce a value of one type to a value of another type. Here are three of them:

  • int() tries to convert to an integer
  • str() tries to convert to a string
  • float() tries to convert to a float

In [ ]:
# two strings of numbers
num_1 = '100'
num_2 = '200'

# what happens when you add them without coercing?
concat = num_1 + num_2
print(concat)

# coerce to integer, then add them
added = int(num_1) + int(num_2)
print(added)
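
These functions "try" to convert, which means they can fail. The sketch below shows the other two functions at work, plus what a failed conversion looks like. It uses try/except, which isn't covered elsewhere in this notebook, to catch the error so the cell keeps running. The sample values are made up.

```python
# float() handles strings with a decimal point
float_val = float('6.4')
print(float_val)        # 6.4

# str() lets you glue a number onto other text
label = str(32) + ' years old'
print(label)            # 32 years old

# int() on a float drops the decimal part (it doesn't round)
truncated = int(6.9)
print(truncated)        # 6

# int() can't parse a string with a decimal point -- it raises a ValueError
try:
    int('6.4')
    conversion_failed = False
except ValueError:
    conversion_failed = True

print(conversion_failed)    # True
```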